ANALYSIS


The structure, function and evolution 
of proteins that bind DNA and RNA 


William H. Hudson and Eric A. Ortlund 


Abstract | Proteins that bind both DNA and RNA typify the ability of a single gene product
to perform multiple functions. Such DNA- and RNA-binding proteins (DRBPs) have unique
functional characteristics that stem from their specific structural features; these developed
early in evolution and are widely conserved. Proteins that bind RNA have typically been
considered as functionally distinct from proteins that bind DNA and studied independently.
This practice is becoming outdated, partly owing to the discovery of long non-coding
RNAs (lncRNAs) that target DNA-binding proteins. Consequently, DRBPs were found to
regulate many cellular processes, including transcription, translation, gene silencing,
microRNA biogenesis and telomere maintenance.


Proteins that bind DNA or RNA are often considered
and studied independently of one another. For example,
transcription factors are usually modelled relatively simply:
they bind to genomic promoters and control target
gene expression by activating or repressing RNA polymerases.
Following transcription, RNA-binding proteins
modulate protein expression by regulating the stability
and translation of mRNAs. However, the consideration
of DNA- and RNA-binding functions within proteins as
separate entities is becoming outdated. The unappreciated
dual DNA- and RNA-binding capacity of a growing
body of proteins plays a key part in modulating gene
expression, cell survival and homeostasis. Recent studies
have demonstrated that many transcription factors are
capable of binding diverse types of RNA, which enables
them to bind to the mRNA products of transcription to
regulate their turnover and to integrate other signals, such
as responses to stress. Additionally, the prevalence and
emerging functions of long non-coding RNAs (lncRNAs)
have revealed that non-coding RNAs target many types of
proteins through direct interactions.

In this Analysis, we enumerate these DNA- and
RNA-binding proteins (DRBPs) and describe their functions,
structures and evolution. We first broadly discuss
the prevalence of DRBPs within the human genome.
We highlight known functions of DRBPs with specific
examples of how the simultaneous and serial RNA and
DNA interaction allows better gene targeting, finer control
of gene expression and integration of metabolic state
or stress to modulate protein activity. We discuss the
structural features of DRBPs that enable dual nucleic
acid specificity, focusing on the limited number of solved
structures that allow direct comparison of a DRBP complexed
with either DNA or RNA. Finally, we discuss
the evolution of dual DNA- and RNA-binding domains
within DRBPs, including ancient domains, for which dual
DNA and RNA binding conferred a selective advantage,
and more modern domains, which have recently been
targeted by rapidly evolving lncRNAs.


Defining DRBPs 

Defining the subset of human proteins that bind both 
DNA and RNA is a difficult task. Using gene ontology 
searches, only 64 human protein-coding genes in the 
QuickGO gene ontology database (European Bioinformatics
Institute) are identified as having direct and
specific experimental evidence for both RNA binding 
(GO:0003723) and DNA binding (GO:0003677) (FIG. 1a). 
The PROTEOME database (BioBase) returns 122 
such proteins, although direct evidence is lacking for 
many of them. 
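
The gene ontology search described above amounts to a set intersection combined with an evidence filter. The following is a minimal sketch under stated assumptions: the annotation records, the gene names (GENE_A and so on) and the choice of evidence codes treated as direct experimental evidence are illustrative placeholders, not QuickGO output. Only the two GO term identifiers are taken from the text.

```python
# Toy illustration: genes annotated with both RNA binding and DNA binding,
# restricted to annotations backed by selected evidence codes.
# All records are hypothetical placeholders, not real QuickGO annotations.

RNA_BINDING = "GO:0003723"
DNA_BINDING = "GO:0003677"

# Assumption: these GO evidence codes are treated as direct experimental evidence.
DIRECT_EVIDENCE = {"IDA", "IPI", "IMP"}

# Each record is (gene, GO term, evidence code).
annotations = [
    ("GENE_A", RNA_BINDING, "IDA"),
    ("GENE_A", DNA_BINDING, "IDA"),
    ("GENE_B", RNA_BINDING, "IEA"),  # electronic annotation only, filtered out by default
    ("GENE_B", DNA_BINDING, "IDA"),
    ("GENE_C", RNA_BINDING, "IPI"),
    ("GENE_D", DNA_BINDING, "IMP"),
    ("GENE_D", RNA_BINDING, "IDA"),
]


def genes_with_term(records, term, allowed_codes):
    """Return the set of genes annotated with `term` under an allowed evidence code."""
    return {
        gene
        for gene, go_term, code in records
        if go_term == term and code in allowed_codes
    }


def find_drbps(records, allowed_codes=DIRECT_EVIDENCE):
    """Return genes supported for both RNA binding and DNA binding."""
    rna = genes_with_term(records, RNA_BINDING, allowed_codes)
    dna = genes_with_term(records, DNA_BINDING, allowed_codes)
    return rna & dna


if __name__ == "__main__":
    print(sorted(find_drbps(annotations)))
```

Loosening the evidence filter enlarges the overlap, which is one reason why different databases can report different DRBP counts for the same pair of terms.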

An alternative approach involves combining evidence 
from studies that have separately attempted to catalogue 
all human proteins that bind DNA or RNA (FIG. 1b). 
A study using protein microarrays and bioinformatic 
approaches identified >4,000 human proteins that directly 
interact with double-stranded DNA (dsDNA) in vitro (REF. 13).
Gene ontology analysis of these proteins reveals that 
the group is highly enriched for the term “RNA bind- 
ing” (P< 1x10), indicating that RNA binding may 
be a common feature of DNA-binding proteins (FIG. 1c). 
Among these dsDNA-binding proteins, the ontology term 
“dsRNA binding” is much more represented than “ssRNA 
binding” (single-stranded RNA binding). 

Figure 1 | Defining human DRBPs. a | Venn diagram shows DNA-binding proteins and RNA-binding proteins in the
QuickGO database supported by low-throughput experimental evidence (as of July 2014). The overlap of these two sets
represents human DNA- and RNA-binding proteins (DRBPs) and consists of 64 proteins. b | Venn diagram shows
DNA-binding proteins and RNA-binding proteins identified in high-throughput studies defining the human mRNA and
double-stranded DNA (dsDNA) interactomes (REFS 13,14). There are 407 proteins found in both studies, indicating that they may
bind both mRNA and dsDNA. In parts a and b, circles are drawn to scale. c | Molecular function gene ontology analysis
reveals that RNA binding is a potentially major function of the dsDNA-binding proteins identified in REF. 13. d | Gene
ontology analysis reveals that DNA binding is potentially a major function of the mRNA-binding proteins identified in
REF. 14. In parts c and d, only selected molecular function attributes are shown for brevity. P values in parts c and d indicate
the probability that the over-representation of the stated ontology term in the selected 407 genes compared with all
human genes is due to chance. These were calculated in the TRANSFAC + PROTEOME database (BioBase) using the
hypergeometric distribution; “very large” indicates a P value of <1 × 10⁻⁴⁰ (−log(P) > 40). rDNA, ribosomal DNA;
ssDNA, single-stranded DNA.



Another study used a crosslinking- and mass 
spectrometry-based approach to identify 860 mRNA- 
binding proteins from HeLa cells termed the mRNA 
interactome (REF. 14). Functional analysis of these proteins
again indicates that dual nucleic acid binding is a wide- 
spread phenomenon (FIG. 1d), as they are significantly 
enriched for both ssDNA binding and dsDNA binding 
(P=8.9x 10! and P=5.5x 10°, respectively). Notably, 
of the 860 proteins identified as mRNA binding, 407 
(47.3%) were independently characterized as dsDNA 
binding in REF. 13. Together, the two studies indicate that
DRBPs are widespread, perhaps constituting 2% of the 
human proteome (407 of ~20,300 proteins, FIG. 1b). This 
number would probably increase if the studies included 
proteins that are expressed in other cell types, proteins 
that require ligand binding-dependent signals for nucleic 
acid binding, or proteins that bind other types of DNA 
or RNA. 
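
The proportions quoted above follow directly from the reported counts, and the same counts can be used to ask how surprising the overlap would be by chance. The sketch below is illustrative only. It assumes exactly 4,000 dsDNA-binding proteins (the text reports only ">4,000") and a background of 20,300 proteins, and it uses a plain hypergeometric tail rather than the TRANSFAC + PROTEOME ontology calculation used for the figure. The function names are hypothetical.

```python
import math

MRNA_BINDERS = 860      # mRNA interactome size reported in the text
OVERLAP = 407           # proteins found in both studies
PROTEOME = 20_300       # approximate proteome size used in the text
DSDNA_BINDERS = 4_000   # assumption: the text reports only ">4,000"


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_log10_tail(overlap, population, successes, draws):
    """Return log10 of P(X >= overlap) for a hypergeometric draw.

    Work in log space so that very small probabilities do not underflow.
    """
    lowest = max(overlap, draws - (population - successes), 0)
    highest = min(successes, draws)
    log_denominator = log_comb(population, draws)
    log_terms = [
        log_comb(successes, k)
        + log_comb(population - successes, draws - k)
        - log_denominator
        for k in range(lowest, highest + 1)
    ]
    peak = max(log_terms)
    log_total = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return log_total / math.log(10)


if __name__ == "__main__":
    print(f"Overlap as share of mRNA binders: {percent(OVERLAP, MRNA_BINDERS):.1f}%")
    print(f"Overlap as share of proteome: {percent(OVERLAP, PROTEOME):.1f}%")
    log10_p = hypergeom_log10_tail(OVERLAP, PROTEOME, DSDNA_BINDERS, MRNA_BINDERS)
    print(f"log10 P(overlap >= {OVERLAP}) under the assumed set sizes: {log10_p:.1f}")
```

Because the size of the dsDNA-binding set is an assumed round number, the resulting probability should be read as an order-of-magnitude illustration rather than a reanalysis of the published data.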


We note that many of the proteins identified in 
REFS 13,14 as DNA and/or RNA binding lack corrobo- 
rating evidence from other studies, and these findings 
should thus be interpreted with caution. For example, 
many identified proteins, such as polymerase subunits, 
may bind nucleic acid-bound proteins without binding 
DNA and/or RNA directly. Additionally, many proteins 
that bind DNA or RNA in vitro may not bind them 
in vivo. However, the two studies provide a reasonable 
estimate of potential human DRBPs owing to their wide 
coverage of the human proteome, and we discuss below 
examples in which the demonstration of protein–nucleic
acid binding in vitro has preceded the discovery of such 
binding in vivo, sometimes by decades. 

In Supplementary information S1 (table), we provide
a detailed list of 149 human DRBPs, with comments
on their nucleic acid-binding properties, structures
and functions. These proteins were selected based
on experimental evidence demonstrating their ability
to bind directly to both DNA and RNA, generally
obtained from studies using more traditional experimental
approaches than the high-throughput studies (REFS 13,14)
discussed above. Although many of the proteins in
Supplementary information S1 (table) have only been
shown to bind DNA and/or RNA in vitro, the remainder
of this Analysis focuses on selected human DRBPs with
known cellular roles.

Figure 2 | Functional and structural properties of DRBPs. The 149 DNA- and
RNA-binding proteins (DRBPs; see Supplementary information S1 (table)) were subjected
to gene ontology enrichment of biological process (PROTEOME database; BioBase)
and to INTERPRO domain enrichment (DAVID ontology) in order to explore the
biological functions of DRBPs and the protein domains commonly found in them. a | In gene
ontology analysis, biological processes such as transcriptional regulation and mRNA
processing are expectedly prominent terms found to be enriched in DRBPs. However,
unexpected functions are also enriched, including response to many cellular stresses
(such as heat, viral infection and radiation). For brevity, only selected functions are
shown. b | In domain enrichment analysis, all domains enriched in the set of 149 DRBPs
that have P values <10” are shown. P values in parts a and b indicate the probability that
the over-representation of the stated term in the 149 DRBPs compared with all human
genes is due to chance. RNP1, ribonucleoprotein 1.


Functions of DRBPs

We carried out gene ontology and domain enrichment 
analyses (FIG. 2) to illuminate the main biological func- 
tions of our list of human DRBPs (see Supplementary 
information S1 (table)). The gene ontology analysis 
revealed expected biological processes, such as tran- 
scriptional regulation, mRNA processing and DNA 
replication. However, several surprising functions are 
also implicated, including the DNA-damage response, 
apoptosis and responses to extreme temperatures (FIG. 2a). 
Ultimately, DRBP functions are governed by their 
inherent structural and biochemical properties. One can 
envision DRBPs capitalizing on both RNA and DNA 
binding in numerous ways; for example, a transcription 
factor that binds DNA and RNA may interact orthogo- 
nally with RNAs that compete with DNA binding to 
repress transcription, or simultaneously with a promoter 
and an RNA co-activator to upregulate transcription. 
The following section focuses on DRBPs that bind DNA 
and RNA competitively (FIG. 3a). 


Binding DNA or decoy RNAs. The role of certain ncRNAs
as decoys of genomic DNA is illustrated by the reduction
in promoter occupancy by transcription factors, typically
measured by chromatin immunoprecipitation (ChIP), in
response to the overexpression of competing decoy RNAs.
The glucocorticoid receptor (GCR), a steroid hormone
receptor, is a classic example of a ligand-activated transcription
factor (reviewed in REF. 15). In its inactivated
state, the GCR is kept in the cytoplasm by chaperone proteins.
Upon ligand binding, the GCR translocates to the
nucleus, where it can bind to the promoters, and regulate
the transcription, of hundreds of genes. Given the anti-inflammatory
role of the GCR, much effort has been put
into developing modulators of GCR-driven transcription.
Recently, the ncRNA growth arrest-specific 5
(GAS5) was found to inhibit the transcriptional activity
of the GCR by competing directly with DNA for protein
binding in vitro and in cells; overexpression of GAS5
leads to a decrease in ChIP-detected GCR occupancy at
its target promoters, as well as a decrease in the mRNA
levels of glucocorticoid-activated genes. As cellular
GAS5 levels are regulated by nonsense-mediated decay
in response to serum starvation and other stressors, the
transcriptional activity of the GCR is tuned by titrating
the levels of GAS5 against the fixed number of genomic
GCR-binding sites in response to cellular stress. Three
closely related steroid receptors that share the DNA
specificity of the GCR (the androgen, progesterone
and mineralocorticoid receptors) are also susceptible
to GAS5-mediated transcriptional repression. Although
steroid receptors have traditionally been thought of as
DNA-binding proteins, the affinity of the GCR for RNA
and DNA is similar, as measured in vitro by glutathione
S-transferase pulldown assays and fluorescence-based
competition assays. The most distantly related member
of the steroid receptor family, the oestrogen receptor,
does not share the DNA specificity of the GCR and is not
susceptible to GAS5-mediated transcriptional repression,
indicating that the binding of steroid receptors to RNA is
sequence specific.
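
The titration argument above can be made concrete with a simple competitive-binding model. This is a sketch under strong assumptions that are not in the source: a single nucleic acid-binding site per receptor, equilibrium conditions, free nucleic acid concentrations that are not depleted by binding, and arbitrary illustrative concentrations and dissociation constants. Equal constants for DNA and RNA are used to mirror the reported similar affinities.

```python
def dna_bound_fraction(dna_conc, rna_conc, kd_dna, kd_rna):
    """Fraction of protein bound to DNA when RNA competes for the same site.

    Standard single-site competitive binding at equilibrium:
        fraction = (D/Kd_DNA) / (1 + D/Kd_DNA + R/Kd_RNA)
    All concentrations and constants must share the same units.
    """
    dna_term = dna_conc / kd_dna
    rna_term = rna_conc / kd_rna
    return dna_term / (1.0 + dna_term + rna_term)


if __name__ == "__main__":
    kd = 1.0         # hypothetical dissociation constant, same for DNA and RNA
    dna_sites = 1.0  # hypothetical, fixed concentration of genomic binding sites
    for decoy in (0.0, 1.0, 10.0, 100.0):
        fraction = dna_bound_fraction(dna_sites, decoy, kd, kd)
        print(f"decoy RNA = {decoy:6.1f}  DNA-bound fraction = {fraction:.3f}")
```

With the DNA term held fixed, the DNA-bound fraction falls monotonically as decoy RNA rises, which is the qualitative behaviour described for GAS5 and the GCR.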

Figure 3 | Three archetypes of DRBP function. a | RNA can compete with DNA for
binding to DNA- and RNA-binding proteins (DRBPs), typically at the same protein
interface. In the case of transcription factors, this can reduce promoter occupancy and
the transcription of target genes. b | DRBPs can regulate gene expression at multiple
levels. In addition to binding to the promoters of genes to regulate their transcription,
DRBPs can also affect microRNA (miRNA) processing, as well as mRNA stability and
translation. c | DRBPs can bind DNA and RNA simultaneously, allowing the RNA to
function as a scaffold to recruit other proteins to a specific DNA locus. Shown here is
steroidogenic factor 1 (SF1) binding to the long non-coding RNA (lncRNA) steroid
receptor RNA activator (SRA) to recruit the SRC1 (also known as NCOA1) transcriptional
complex in a ligand-independent manner. NF, nuclear factor; NF90, NF of activated
T cells 90 kDa; pre-miRNA, precursor miRNA; pri-miRNA, primary miRNA.


Rossmann fold

A common protein-folding
pattern that contains the
topology of a β-α-β fold. It is
found in many nucleotide-binding
proteins.


Additional examples of pairs of transcription factors
and decoy RNAs are nuclear factor-Y (NF-Y), which
binds the lncRNA P21-associated ncRNA DNA-damage
activated (PANDA), and NF-κB, which binds the mouse
pseudogene-derived RNA Lethe. The dual nucleic acid-binding
activity of NF-κB was demonstrated in vitro
many years before the discovery of an endogenous
RNA target, suggesting that transcription factors that
are known DRBPs in vitro may have endogenous RNA
targets awaiting discovery; an example of such DRBPs
is the acute myeloid leukaemia 1 (AML1; also known
as RUNX1) protein. Although structural information
on the interaction of human proteins with their decoy
RNAs is lacking, a recent study demonstrated an elaborate
mechanism of an analogous bacterial system: the
sequestration of ribosomal RNA small subunit methyltransferase
E (RsmE) by the non-coding RNA RsmZ.
Competitive DNA and RNA binding is a feature not only
of transcription factors but also of nucleic acid-modifying
enzymes, such as DNA methyltransferases. In humans,
DNA methylation is initiated by DNA (cytosine-5)-methyltransferase
3A (DNMT3A) and DNMT3B;
DNMT1 maintains this methylation by binding to hemimethylated
DNA after replication (reviewed in REF. 20).
RNA binding can inhibit the DNA-binding and methylation
activity of both DNMT3A and DNMT1 (REF. 22).
In vitro, DNMT1 binds RNA with a higher affinity than
DNA, as shown in electrophoretic mobility shift assays
(EMSA). In the case of DNMT1, and probably in that
of DNMT3A, RNAs bind to the catalytic domain of the
methyltransferase to inhibit DNA methylation.

It is notable that several metabolic enzymes, such as
the glycolytic enzymes lactate dehydrogenase,
glyceraldehyde-3-phosphate dehydrogenase (GAPDH)
and α-enolase (ENO1), are DRBPs with competitive
DNA- and RNA-binding capacities. In the case
of GAPDH, DNA and RNA compete with the cofactor
NAD⁺ for binding to the enzyme, suggesting that
Rossmann fold-containing proteins such as GAPDH
may be sensitive to cellular DNA and/or RNA levels.
ENO1 binds RNA as a monomer, which inhibits the
formation of the catalytically active protein dimer.
NAD⁺-specific isocitrate dehydrogenase, which converts
isocitrate to α-ketoglutarate, is allosterically inhibited
by the 5′ untranslated regions of yeast mitochondrial
mRNA. Binding of RNA and DNA to metabolic
enzymes indicates that nucleic acids can regulate the
function of proteins other than transcription factors to
modulate cellular metabolism.


DRBPs that regulate gene expression at multiple levels. 
Approximately half of the DRBPs we identified in our 
analysis are transcription factors. As discussed above, 
some such proteins have been shown to be the targets 
of decoy RNAs. By contrast, several others bind both 
the DNA and the mRNA of their target genes (FIG. 3b). 
Regulating genes at both the DNA and the RNA lev- 
els allows powerful, combinatorial control over pro- 
tein expression and may enable DRBPs to generate 
both immediate effects (through regulating RNA 
turnover) and long-lasting effects (through regulating 
transcription). 

When activated, the GCR can promote the transcription
of anti-inflammatory genes and repress the transcription
of pro-inflammatory genes. Agonist-bound
GCR destabilizes the mRNA of pro-inflammatory genes
such as the chemokine (C-C motif) ligand 2 (CCL2; also
known as MCP1) gene through direct RNA binding, perhaps
by the recruitment of ribonucleases. The presence
of a GCR-binding motif in many immunogenic mRNAs
has been confirmed using RNA immunoprecipitation
(RNA-IP) and suggests that the GCR can accelerate the
decay of many mRNAs, broadening its known role in
the anti-inflammatory response. Given that the GCR also
binds directly to pro-inflammatory transcription factors,
such as activator protein 1 (AP-1) and NF-κB,
it seems that the GCR uses its diverse DNA-, RNA- and
protein-binding capacities to regulate inflammatory genes
at the transcriptional and post-transcriptional level.
Transcription factors can also regulate gene expres- 
sion post-transcriptionally through the regulation of 
microRNA (miRNA) biogenesis. miRNAs are small 
RNAs that facilitate gene silencing through sequence- 
specific pairing to target mRNAs and recruitment of 
these to the RNA-induced silencing complex (RISC; 
reviewed in REF. 41). Several transcription factors have 
been shown to regulate Drosha-mediated primary miRNA 
(pri-miRNA) processing, a key step in the biogenesis of 
functional miRNAs. SMAD proteins, which are transducers
of transforming growth factor-β (TGFβ) signalling,
activate transcription by forming a DNA-binding
heterodimer (reviewed in REF. 43). SMAD proteins
also increase the levels of several miRNAs, including
miR-21 (REF. 44), which has important roles in development
and immunity. Surprisingly, the increase in
miR-21 levels is due not to increased transcription of
pri-miR-21 but to increased Drosha-mediated processing
of pri-miR-21 to the precursor miRNA (pre-miRNA)
mir-21 (REF. 44). Bioinformatic analysis identified a conserved
RNA motif in TGFβ-regulated miRNAs, which
was shown by RNA-IP and EMSA to bind directly to
the MAD homology 1 domain of SMAD to mediate Drosha
processing. Interestingly, the RNA sequence motif that
is bound by SMAD4 and that mediates the regulation of
miRNA expression post-transcriptionally is identical to
the DNA sequences that are bound by SMAD4 and that
mediate regulation of gene expression transcriptionally.
NF of activated T cells 90 kDa (NF90; also known as
ILF3) is a particularly versatile DRBP that, along with
its partner NF45 (also known as ILF2), has important
roles in T cell activation. Through the direct binding
of DNA, mRNA and miRNA, NF90 controls transcription,
regulates mRNA turnover and translation,
and affects miRNA processing, respectively. These
functions assist in its role in T cell activation: NF90
upregulates the mRNA levels of interleukin-2 (IL-2),
a critical cytokine in T cell development, by binding
its promoter to activate its transcription and by stabilizing
the IL-2 mRNA through direct binding to its 3′
untranslated region, as was found by EMSA and ribonucleoprotein
(RNP) immunoprecipitation analysis.
Additionally, using in vitro pri-miRNA-processing assays
and RNA-IP, NF90 in complex with NF45 was shown to
inhibit the processing of the pri-miRNA pri-let-7a by
binding it directly. let-7a represses IL-6, a cytokine that
is critical for T cell survival and proliferation, which
may link inflammation to cancer, and let-7 downregulation
following NF90 upregulation reduces survival in
several cancer types. In summary, these examples
illustrate that DRBPs can use both transcriptional and
post-transcriptional mechanisms to serve as potent
controllers of gene expression.


Simultaneous binding of DNA and RNA. In contrast to
DRBPs that target DNA or RNA serially to serve different
or related functions, another class of DRBPs binds
RNA and DNA simultaneously to perform a single function
(FIG. 3c). Generally, transcription factors not only
require DNA binding to target promoters but also bind
to co-repressors or co-activators to affect transcriptional
regulation. There are several examples of RNA molecules
acting as co-activators by simultaneously binding
DNA and various transcription factors. The lncRNA
rhabdomyosarcoma 2-associated transcript (RMST),
in particular, is required for SOX2 to bind neurogenic
gene promoters and subsequently upregulate them
(REF. 55); SOX2 is a transcription factor with important roles in
development, pluripotency and cell fate. RNA-IP and
RNA pulldown experiments showed that RMST interacts
directly with SOX2 (REFS 55,57), and DNA occupancy of
SOX2 measured by ChIP followed by sequencing (ChIP-seq)
was reduced following RMST depletion. The
lncRNA EVF2 is a transcriptional co-activator of DLX2
(REF. 58) and recruits methyl-CpG-binding protein 2
(MECP2) to intergenic enhancers. A direct interaction
between DLX2 and EVF2 has been demonstrated by the
immunoprecipitation of DLX2 followed by reverse transcription
PCR of the EVF2 lncRNA, and MECP2 also
has previously been shown to bind RNA. It should be
noted that RNA-mediated recruitment of a protein to a
particular DNA locus might not require direct binding
of both DNA and RNA by the protein, as ncRNAs could
recruit transcription factors to a particular DNA locus
to which the lncRNA is bound. Dual nucleic acid recognition
also facilitates targeted gene repression through
RNA-guided DNA methylation. This phenomenon was
first discovered in plants, and some mammalian RNA
guides of DNA methylation have since been found,
although their mechanisms of action are less clear.
In mice, DNMT3A forms a complex with Tsix RNA to
promote methylation of the X-inactive-specific transcript
(Xist) promoter.

Several nuclear receptors, including steroidogenic
factor 1 (SF1), DAX1 (also known as NR0B1)
and thyroid receptor-α (TRα), bind simultaneously
to both gene promoters and the RNA co-activator
SRA (steroid receptor RNA activator) to modulate
transcriptional activation (FIG. 3c). Using pulldown
experiments, SF1 and TRα have been shown to bind
SRA through their hinges, which are flexible, disordered
regions that connect their DNA- and ligand-binding
domains. Knockdown of SRA decreases the interaction
of SF1 with protein transcriptional activators and
the transcription of SF1-regulated genes. Several other
nuclear receptors associate with SRA but lack
evidence for direct binding, including the androgen,
progesterone and oestrogen receptors, as well as
retinoic acid receptor-α (RARα), which may bind SRA
and its target gene promoters simultaneously.

Homeodomain

A protein structural domain
that binds DNA and RNA,
and that is found most
commonly in transcription
factors. It consists of a
helix–turn–helix structure of
60 amino acids.


GAR domain 

A Gly- and Arg-rich motif that 
adopts a repeated β-turn
structure. It is found most 
commonly in proteins that 
bind RNA. 


Zinc-finger domains 
Structural motifs that are 
characterized by the 
coordination of one or more 
zinc ions. There are several 
different zinc-finger motifs, 
and each displays a different 
binding mode and structure. 


WW domains 

Small protein motifs of 40 
amino acids that mediate 
specific protein–protein
interactions with short Pro-rich 
or Pro-containing motifs. 


Ftz-F1 domain 

A protein domain first 
identified in the Fushi Tarazu 
factor 1 (FTZ-F1) nuclear 
receptor. It contains an 
evolutionarily conserved 
LXXLL motif that recognizes 
other LXXLL-related motifs. 


K-homology domain 

A conserved protein domain 
that interacts with both RNA
and DNA through a binding
cleft formed between two
α-helices, two β-sheets and
the GXXG loop.


Cold-shock domain 

(CSD). A small (~70-amino-acid)
domain with high similarity
to the ribonucleoprotein 1
RNA-binding motif. This 
domain is found in DNA- and 
RNA-binding proteins in 
bacteria, archaea and 
eukaryotes. 


Interfaces 

The solvent-accessible portions 
of a protein that are capable of 
binding a ligand (including
DNA and RNA) in a
competitive or non-competitive 
manner. 


Crosslinking immunoprecipitation has demonstrated
that RARα can bind to and regulate the translation of
target mRNAs through a unique RNA-binding motif at
its carboxyl terminus.

Another example of simultaneous DNA and RNA
binding that is required for DRBP function is the role
of telomeric repeat-binding factor 2 (TRF2; also known
as TERF2) at telomeres. Deletion of TRF2 leads to an
arrest in cell division caused by the formation of chromosome
end-fusions. Crystal structures have revealed
that TRF2 binds to telomeric DNA in a sequence-specific
manner through a C-terminal DNA-binding domain,
which resembles a homeodomain. Part of the role of
TRF2 at the telomere includes the recruitment, through
its positively charged amino-terminal GAR domain, of the
origin-recognition complex (ORC; a collection
of proteins that serves as a scaffold for DNA replication
factors, among other functions) so that it can assist in the
maintenance of telomere structure. Using biotinylated
RNA pulldown experiments, RNA-IP and EMSAs, the
GAR domain responsible for ORC recruitment was later
shown to bind telomeric repeat-containing RNA (TERRA).
Depletion of TERRA hampers ORC recruitment to the
telomere without affecting TRF2 binding to the telomere
itself, suggesting a model in which TRF2 serves as a mediator
between telomere DNA and TERRA, which in turn
recruits factors required for telomere maintenance.


Structural characteristics of DRBPs 

For a protein such as TRF2 to coordinate telomeric DNA 
binding and recruit protein complexes by binding of 
RNA, it must have multiple nucleic acid-binding motifs. 
Some DRBPs, such as the GCR and NF-κB, have maintained
domains that are capable of binding both DNA
and RNA, which allows decoy RNAs to evolve and com- 
pete with DNA for protein binding. We analyse below the 
prevalence of structural domains in DRBPs and discuss 
examples of DRBPs that bind both single-stranded and 
double-stranded nucleic acids. 


DRBP domains that enable DNA and RNA interactions. 
We carried out InterPro domain enrichment analysis 
by DAVID on our 149 DRBPs to identify domains
enriched in proteins that bind both DNA and RNA 
(FIG. 2b). The RNA-recognition motif (RRM; also known 
as the RNP domain or RNA-binding domain) was the 
most highly enriched domain in DRBPs (P=2 x 106). 
The RRM is an abundant, short (~100-amino-acid 
long) domain that generally recognizes ssRNA and is 
often present in proteins with other domains, such as 
zinc-finger domains, WW domains or additional RRMs.
Such multidomain DRBPs may bind RNA and DNA 
simultaneously through separate domains, as occurs 
with heterogeneous nuclear ribonucleoprotein A1, which contains
two RRMs. Single RRM-containing proteins are also
capable of binding both DNA and RNA, and this function
is present, for example, in RNA-binding motif 3
(RBM3), TBP-associated factor 15 (TAF15), and
TAR DNA-binding protein 43 (TDP43; also known
as TARDBP) (see Supplementary information S1
(table)). Such bivalent domains may not have the same


sequence specificity when binding DNA and RNA, high- 
lighting the complexity of recognizing nucleotide bases 
in a sequence-dependent manner. 

Nuclear receptor domains are also enriched in DRBPs:
RARα binds mRNA through a unique C-terminal
domain, SF1 binds the RNA co-activator SRA through
its hinge and a unique Ftz-F1 domain, and TRα binds SRA
through its hinge. The majority of nuclear receptors
have two highly conserved Cys4 zinc-fingers through
which they bind DNA, and some nuclear receptors,
such as the GCR, can also bind RNA through these
domains. Other types of zinc-fingers are also enriched
in DRBPs, such as the RAN-binding protein 2 (RANBP2)
type. Other notably enriched domains in DRBPs are
the K-homology domain, the dsRNA-binding domain
(dsRBD), the cold-shock domain (CSD) and various helicase
domains. Each of these domains is capable of binding
DNA and RNA, and we focus below on the structural
mechanisms underlying this dual specificity.


General properties of DRBPs. There are only two chemical 
differences between RNA and DNA. First, RNA (but not 
DNA) has a 2’ hydroxyl (2’OH) group on the ribose sugar, 
which allows an additional hydrogen bond to be formed 
and a greater diversity of secondary structures than is 
possible in DNA. Second, RNA contains uracil rather 
than thymine, as in DNA; uracil lacks a methyl group at 
the C5 position. A comparative analysis of known pro- 
tein-nucleic acid structures revealed that the recognition 
of DNA occurs largely through electrostatics and direct 
base-protein interactions. Conversely, RNA recognition 
by proteins mainly depends on shape complementarity 
and interaction with the 2′OH group. Given these general
differences, one could expect that, during evolution,
highly selective protein interfaces would be generated that 
are optimized for either RNA or DNA, with minimal 
cross-binding. However, the most energetically favourable
associations between proteins and nucleic acids rely
on hydrophobic and charge–charge interactions. These
interactions are less constrained than interactions with the 
sugar backbone or with the nucleotide base edge, which 
is capable of highly-specific Watson-Crick base pair- 
ing. Thus, DRBP domains that competitively bind DNA 
and RNA probably rely on the less specific hydro- 
phobic and charge-charge interactions. For example, 
ssRNA-binding proteins are more likely to form hydro- 
gen bonds with bases rather than with the phosphate- 
sugar backbone, compared to those that recognize folded 
RNA, such as ribosomal proteins and tRNA synthetases.
Because ssRNA-binding proteins do not rely heavily on 
sugar recognition, they are more likely to also bind DNA. 
This may explain why the RRM is the most enriched 
domain in the DRBPs included in our analysis (FIG. 2b).


Figure 4 | The structural basis for dual DNA and RNA recognition by TDP43 and by the NF-κB subunit p50.
Protein–RNA structures are shown in blue and protein–DNA structures in green, with protein in the darker shade.
π-stacking interactions play a prominent part in both the single-stranded DNA (ssDNA)- and the ssRNA-binding activities
of TAR DNA-binding protein 43 (TDP43). a,b | Phe149 (part a) and Trp113 (part b) within the first RNA-recognition motif
(RRM1) of TDP43 stack with both RNA and DNA bases. c,d | In the second RRM (RRM2) of TDP43, Phe194 is capable of
recognizing both uracil in RNA (part c) and thymine in DNA (part d); the additional methyl group at C5 in thymine does not
contribute to nucleic acid specificity. e | When bound to RNA, both the terminal amine and ε-nitrogen of Arg227 in RRM2
of TDP43 contact a 2′ hydroxyl (2′OH) group on the RNA backbone. f | When bound to DNA, these same groups can also make
contacts with the DNA backbone, both directly and through water-mediated hydrogen bonding. g,h | The p50 subunit of
nuclear factor-κB (NF-κB) makes strikingly similar base-specific contacts when bound to an RNA aptamer (part g) or to
double-stranded DNA (part h). This is due in large part to the similar secondary structure and chemical moieties presented
by the RNA and DNA. Major groove width was calculated with 3DNA software using phosphate–phosphate distances.


The RRM. The RRM is an extremely versatile domain
that is capable of binding (mainly single-stranded) RNA
and DNA, as well as proteins. RRMs preferentially
interact with nucleic acid bases rather than with the
phosphate-sugar backbone. The structural nature of ssRNA
and ssDNA allows much easier access to the exposed aromatic
base faces, as opposed to the hydrogen bonding to
base edges that occurs frequently with double-stranded
nucleic acid-binding DRBPs. Additionally, stacking
interactions with the faces of bases are more energetically
favourable than recognition of the nucleotide edge.
Therefore, stacking interactions between aromatic protein
side chains and nucleic acid bases are often observed
in single-stranded nucleic acid-binding proteins.
TDP43 is a DRBP that has important roles in mRNA 
splicing and miRNA biogenesis. It contains two
RRMs, which are separated by a short loop and are both 
capable of binding DNA and RNA. Crystal structures of 
the TDP43 RRMs in complex with DNA and RNA have 
been reported (see also the currently unpublished
structure of TDP43 RRMs), making TDP43 an excel- 
lent case study for dual DNA and RNA recognition by 
RRMs. The DNA- and RNA-bound structures of TDP43 
reveal nearly identical modes of nucleic acid recognition. 
Aromatic side chains, such as Phe149 within the first 
RRM (RRM1), form stacking interactions with DNA or 
RNA bases (FIG. 4a). Trp113, which is part of the more 
flexible loop 1, is able to shift conformations and base- 
stack slightly differently when bound to different nucleic 
acid sequences (FIG. 4b), whereas Phe149 in the rigid β-sheet
of the RRM1 fold makes similar interactions with
DNA and RNA (FIG. 4a). Relying on the more energetically
favourable π-stacking interactions through the planar


π-Stacking interactions
Non-covalent interactions 
between two aromatic 
molecules owing to the 
attractive force originating 
from the opposing electrostatic 
potentials between two 
adjacent aromatic amino acid 
residues. 


face of the DNA and RNA bases results in less specific- 
ity than that gained from hydrogen bonding with the 
base edge. Uracil and thymine interact with Phe194 of 
the second RRM (RRM2) in the RNA- and DNA-bound 
structures, respectively (FIG. 4c,d). Despite the additional 
methyl moiety at position C5 in the DNA, no TDP43 
residues recognize the edge of the nucleotide to interact 
or clash with the additional carbon (FIG. 4d). Thus, one 
of the chemical differences between DNA and RNA, 
the use of uracil in RNA, plays no part in nucleic acid 
discrimination in this example. 

By contrast, the RRM2 of TDP43 does make RNA- 
specific contacts with the 2'OH group. The majority of 
protein-2’OH group interactions are mediated through 
protein side chains, and both the Lys263 and Arg227
(FIG. 4e) side chains in RRM2 contact a 2'OH group 
when bound to RNA. However, when TDP43 is bound 
to DNA, these same protein side chains contact the 
DNA backbone phosphates (FIG. 4f), demonstrating that
amino acids are capable of reorienting to allow distinct 
types of interactions to support RNA and DNA binding. 
Nevertheless, DNA binding is not a general property of 
all RRMs. For example, the RRM of poly(A)-binding 
protein relies on many RNA-specific 2'OH contacts for 
RNA interaction, and binding to DNA may be at low 
affinity, if detectable at all.

DRBPs that recognize double-stranded nucleic acids.
Crystal structures of protein-dsRNA complexes are 
less common than their single-stranded counterparts, 
but there are some examples that are instructive for 
dual nucleic acid recognition. NF-κB is a central transcription
factor of immune signalling and is formed of
homodimers or heterodimers of Rel family proteins, 
such as p50 (also known as NFKB1) or p65 (also known 
as RELA). High-affinity RNA aptamers have been developed
for both the p50 subunit and the p65 subunit of
NF-κB, with an RNA-binding affinity that approaches
the transcription factor’s affinity for native DNA
response elements. DNA with an identical sequence
to the p50-targeting aptamer will not bind p50 (REF. 10); 
p50 is therefore another DRBP that binds RNA and DNA 
in a sequence-specific manner, with different sequence 
specificities for each. 

Crystal structures of p50 bound to DNA and to RNA 
reveal that both bind at the same surface of the p50 
immunoglobulin-like domain. Although p50 binds
to RNA as a monomer and to DNA as a dimer, similar 
networks of base-specific interactions occur between 
protein and nucleic acids in each structure (FIG. 4g,h). 
Not only do the DNA- and RNA-contacting residues 
of p50 maintain an equivalent position, but both DNA 
and RNA also present similar interfaces for p50 recognition
in charge distribution and in secondary structure
(FIG. 4g,h). This is a seminal, structurally confirmed exam- 
ple of DNA mimicry by RNA to bind to a transcription 
factor and, although the RNA in this case was artificial, 
DNA mimicry has been hypothesized to play a part in the
endogenous regulation of several transcription factors.

Structures have also been solved of the DRBP dsRNA- 
specific adenosine deaminase 1 (ADAR1), which binds 
both double-stranded Z-DNA and double-stranded Z-RNA 
through its unique Zα domain. The ability of the Zα
domain to make sequence-independent interactions with 
the Z-form phosphate backbone of both DNA and RNA 
enables ADAR1 to sense nucleic acid secondary struc- 
ture conformations. Thus, double-stranded nucleotide- 
binding DRBPs can recognize their DNA and RNA 
targets through sequence-specific interactions (in the 
case of NF-κB) or through nonspecific interactions with
the DNA and RNA backbone (in the case of ADAR1). 


The evolution of DRBPs 

The evolutionary forces driving the structure and func- 
tion of DRBPs are complex, but understanding them will 
help us to identify new DRBPs and perhaps predict their 
susceptibility to interactions with lncRNAs. Although
the DRBPs identified in our analysis are members of 
many different structural classes, each with their own 
evolutionary history, we focus on members of two very 
dissimilar DRBP families: CSD-containing proteins and 
eukaryotic DNA methyltransferases. CSD-containing 
proteins, which are required to protect cells from low 
temperatures, are members of an ancient DRBP family 
that uses weak selection criteria to interact with nucleic 
acids and therefore intrinsically bind to both DNA and 
RNA. Members of the eukaryotic DNA methyltransferase 
family are DRBPs that have more recently evolved the 


ability to recognize both DNA and RNA: they preferen- 
tially interact with DNA, and only one family member 
(DNMT2) acquired the ability to bind and methylate 
tRNAs. The discovery of a eukaryotic DNA methyltransferase
with a predominant tRNA methylation
activity and only a modest DNA methylation activity 
showcases how evolution can modify protein surfaces to 
create new functions of DRBPs. 


The ancient CSD DRBPs. The CSD is one of the most 
ancient nucleic acid-binding domains found in bacteria, 
archaea and eukaryotes. All CSD-containing proteins 
bind DNA and RNA (see Supplementary information S1 
(table)). In humans, there are several CSD-containing 
proteins, such as the three Y box-binding proteins, 
LIN28A and LIN28B (which are homologues of the 
Caenorhabditis elegans LIN-28), and CSD-containing 
protein E1 (CSDE1). Y box 1 (YB1; also known as YBX1)
was originally named for its ability to bind and repress
the Y box of major histocompatibility complex class II
promoters. YB1 also binds RNA, with roles in alternative
splicing, translational control and RNA stabilization.
In addition, YB1 binds to damaged DNA and is
involved in the DNA-damage response; it translocates
to the nucleus following stresses, such as exposure
to ultraviolet radiation.

In bacteria, CSDs exist in short proteins that contain 
one CSD with short flanking sequences. In Escherichia 
coli, there are nine such proteins (cold-shock protein A 
(CspA)-CspI), which are probably products of multiple 
gene duplication events. Of these, CspA, CspB, CspG
and CspI are induced by cold stress, with CspA constituting
>10% of all protein synthesized during cold shock.
Simultaneous deletion of these four genes results in
lack of E. coli colony formation at temperatures at or lower
than 25°C (REF. 111). CspD is induced by nutrient stress,
but CspC and CspE are constitutively expressed at normal
growth temperature. Many (if not all) of the Csp proteins
bind DNA and RNA and have similar roles to
those of the human CSD-containing proteins, including
in maintaining RNA stability, in translational regulation,
in transcriptional control, in DNA replication
and repair, and in chromosome folding.

CSD-containing proteins are widespread in plants,
in which they have similar cellular functions. The first
Csp-like protein found in plants was wheat CSP1
(WCSP1), which is upregulated specifically by cold stress
and binds ssDNA, dsDNA and RNA homopolymers.
WCSP1 was found to complement the cold-sensitive
phenotype of the E. coli four-gene knockouts mentioned
above, exhibiting remarkable functional conservation.
In addition, WCSP1 showed nucleic acid melting activ- 
ity in E. coli, which is critical to preventing inappropri- 
ate nucleic acid secondary structures that disrupt and 
terminate transcription. This activity is similar to the 
endogenous E. coli CspA, which also has transcription 
antitermination activity. Arabidopsis thaliana has four
CSD-containing proteins, CSP1-CSP4, all of which can
also complement the quadruple csp knockout in E. coli
to varying degrees, suggesting that their DNA and RNA
interactions are well conserved during evolution.
Figure 5 | DNA methyltransferases target both DNA and RNA. a | Best known for their
role in gene silencing, all DNA methyltransferase family members are able to interact
with both RNA and DNA. DNA (cytosine-5)-methyltransferase 1 (DNMT1) and
DNMT3 play a part in initiating and maintaining DNA methylation, whereas DNMT2
methylates tRNAs. This modification is critical for maintaining tRNA stability and cell
viability. b | The evolution of the three major DNMTs is depicted in a cladogram.
DNMT2 probably diverged from its ancestral DNA methyltransferase to perform a critical
role in methylating tRNAs, a function which it performs redundantly with NSUN2
(REF. 132). This radical change in substrate specificity highlights the ability of evolution to
reshape a DNA-binding interface into one that preferentially recognizes RNA.


Unlike their counterparts in bacteria, most plant and 
animal CSD-containing proteins have additional func- 
tional domains (including more CSDs), which expand 
their functions, protein-protein interactions and/or 
nucleic acid-binding specificities. For example, the 
human protein CSDE1 has five CSDs, which increase 
the protein’s affinity for target RNA sequences. YB1 has
both an N-terminal and a C-terminal domain flanking 
its CSD, which can support homomultimerization and 
interactions with many other protein partners (reviewed 
in REF. 128). In addition to its CSD, WCSP1 has three 
CCHC zinc-fingers, through which most of its dsDNA 
binding is mediated. Nevertheless, the exceptional
sequence and functional conservation between eukary- 
otic CSD-containing proteins and bacterial Csp proteins 
demonstrate a conserved, ancient role and origin of the 
domain. It is likely that a CSD fold that was capable of 
binding DNA and RNA was present in the last common 
ancestor of bacteria, archaea and eukaryotes.


The curious case of DNMT2. DNA methylation has 
important roles in gene expression and repression of 
transposable elements in eukaryotic cells. There are three 
eukaryotic proteins in the cytosine-C5 DNA methyl- 
transferase family: DNMT1, DNMT2 and DNMT3. 
Whereas DNMT1 and DNMT3 play important parts 
in maintaining genome-wide methylation, DNMT2 has 
little DNA methylation activity and is instead capable
of methylating tRNAs (FIG. 5a). When this activity was
discovered, it was speculated that the three eukaryotic
DNMTs might have evolved from an RNA methyltransferase.
However, there is no evidence that DNMT2 is
more closely related to the ancestral protein of the family 
members. In fact, the three eukaryotic DNMTs may not 
be monophyletic and may have evolved from separate 
bacterial DNA methylation restriction-modification 
enzymes. Thus, it seems likely that DNMT2 shifted
its nucleic acid specificity from DNA to RNA in the last
common eukaryotic ancestor (FIG. 5b).

Despite the relatively narrow substrate specificity 
of DNMT2 compared with that of its family members, 
it is highly conserved and is the only extant DNMT in 
some species, such as Schizosaccharomyces pombe and 
Drosophila melanogaster. This seems to indicate that
DNMT2 has an important physiological role; however,
Dnmt2-/- mice are viable and fertile, and show no obvious
phenotype. This apparent contradiction was resolved
with the recent report that deletion of DNMT2 in addition
to another tRNA methyltransferase, NSUN2, in mice
is lethal. These mice show defects in tRNA stability,
protein synthesis and differentiation, implying that
the DNA methylation activity of DNMT2 is dispensable, 
whereas its tRNA methylation activity is not. 

NSUN2 is a member of the nuclear protein 1 (NCL1) 
family of eukaryotic RNA cytosine-C5 methyltransferases,
which are broadly distributed among eukaryotes.
Interestingly, NSUN2 itself is a DRBP and is able
to bind and methylate both tRNA and hemi-methylated
DNA. Crosslinking immunoprecipitation-based
analyses showed that NSUN2 also methylates mRNAs
and non-coding RNAs. Given the distant evolutionary
relationship between DNA and RNA cytosine-C5
methyltransferases, NSUN2 and DNMT2 have most
likely undergone convergent evolution from an RNA- 
binding and a DNA-binding protein family, respectively, 
to ensure proper tRNA modification. These evolutionary 
trajectories have bestowed on both proteins the ability 
(if residual) to bind and modify both DNA and RNA. 
This indicates not only that proteins with evolutionarily 
conserved DNA-binding activities are capable of binding 
RNA (and vice versa) but also that some nucleic acid sub- 
strates may be similar enough in sequence and structure 
to promote binding promiscuity. As mentioned above, 
this phenomenon is exploited by RNAs, both endogenous 
and artificial, that function as decoys to modulate DRBP 
function.


Conclusion and perspectives 

In this Analysis, we have demonstrated that DRBPs con- 
stitute a significant fraction of cellular proteins — per- 
haps 2% of the human proteome — and have important 
Intrinsically disordered
protein 

A protein that does not have a 
well-ordered three-dimensional 
structure, such as proteins 
containing random coils or 
multiple domains connected 
with flexible linkers. 


Systematic evolution of 
ligands by exponential
enrichment 

(SELEX). A technique to 
identify ligand-binding- 
sequence specificity that is 
based on sequential rounds of 
binding of an oligonucleotide 
library to the ligand, followed 
by PCR amplification of the 
bound sequences. 


cellular roles. Their functions include the control of 
transcription and translation, DNA repair, mediating 
responses to stress, splicing and apoptosis. These func- 
tions are intimately linked to the structures of DRBPs: 
orthogonal binding of DNA and RNA provides an 
opportunity for competitive regulation of transcription 
by decoy RNAs, whereas simultaneous binding of DNA 
and RNA permits transcriptional activation by RNA co- 
activators or allows the recruitment of RNA-containing 
complexes to specific DNA loci. In turn, the structures 
underlying DRBP functions are linked to their evolution. 
Some DRBPs contain ancient domains that have long 
bound DNA or RNA; others contain multiple domains 
that separately confer DNA- and RNA-binding abilities 
and mediate their functional roles. 

The majority of RNA-binding proteins have had 
remarkably similar motifs during evolution, although
individual members of protein families, such as the forkhead
box transcription factors, can have diverse nucleic
acid sequence specificities arising from independent evolutionary
events. It is also worth noting that intrinsically
disordered protein domains that do not fold into defined
secondary structures may also play important parts in
mediating nucleic acid binding, as was found for RNA
chaperones. In addition to protein evolution, nucleic
acid sequence evolution has important roles in the development
of DRBP function. The discovery of lncRNAs
illuminated new cellular binding targets for proteins that
were previously thought of as DNA-specific binding proteins.
Tens of thousands of human lncRNAs have been
catalogued, and it is likely that many of them have yet- 
undiscovered functions requiring binding to proteins that 
are currently considered as DNA-specific binding proteins 


or that have so far only been shown to bind RNA in vitro. 
For example, the GCR and the oestrogen receptor were 
shown to bind DNA and RNA competitively >20 years 
before a physiological role for RNA-steroid receptor interactions
was established. Experimental selection techniques,
such as systematic evolution of ligands by exponential
enrichment (SELEX), have been used to develop inhibitory
RNA aptamers for DNA-binding proteins, such as NF-κB.
If such inhibitory RNA binding is functionally advantageous,
the rapidly evolving sequences of lncRNAs
could provide a platform for the evolution of an analogous
endogenous function, and many DRBPs may have
species-specific RNA targets. For example, the RNA Lethe,
which binds NF-κB, exists only in mice and is not present
even in the closely related rat genome.
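
As a rough illustration of why sequential rounds of binding and amplification can enrich rare high-affinity binders, the following minimal Python sketch treats SELEX as repeated reweighting of a library by a per-round retention probability. The three sequence classes, their starting fractions and their retention probabilities are hypothetical values chosen only for illustration; they are not taken from any study discussed here, and the model ignores PCR bias and nonspecific background.

```python
def selex_round(fractions, retention):
    """One idealized SELEX round.

    fractions: share of the pool held by each sequence class (sums to 1).
    retention: assumed probability that a molecule of each class stays
               bound to the target protein during the selection step.
    The bound molecules are then 'amplified' back to a full pool, which
    is modelled here as renormalizing the shares so they sum to 1 again.
    """
    bound = [f * p for f, p in zip(fractions, retention)]
    total = sum(bound)
    return [b / total for b in bound]


def run_selex(fractions, retention, rounds):
    """Apply several selection-and-amplification rounds in sequence."""
    for _ in range(rounds):
        fractions = selex_round(fractions, retention)
    return fractions


# Hypothetical starting library: a rare tight binder, some weak binders,
# and a large excess of essentially non-binding sequences.
library = [0.001, 0.099, 0.9]
retention = [0.5, 0.05, 0.01]

enriched = run_selex(library, retention, rounds=5)
for name, share in zip(["tight", "weak", "non-binder"], enriched):
    print(f"{name}: {share:.6f}")
```

Under these assumed values, the share of each class after n rounds is proportional to its starting fraction multiplied by its retention probability raised to the power n, so a binder that begins as a tiny minority comes to dominate the pool within a few rounds.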

Proteins that bind both DNA and RNA could have 
several obvious functional advantages. By binding to 
both mRNAs and their encoding promoters, DRBPs 
can exert a powerful, amplified effect on gene expres- 
sion. This also allows greater flexibility in generating 
cellular responses, as these DRBPs could both produce 
rapid effects on protein synthesis and impart long-acting 
changes on gene expression. At a cellular level, using one 
DRBP rather than two independent DNA-binding and 
RNA-binding proteins is more efficient, as it requires the 
transcription and translation of only one gene product. 
Finally, competitive RNA and DNA binding by some 
DRBPs allows an additional level of transcription factor 
regulation through decoy RNAs. These functional advantages,
in addition to the rapid pace at which lncRNAs and
their functions are being discovered, strongly indicate 
that more DRBPs and DRBP-mediated functions will be 
discovered in the coming years. 